74 research outputs found

    Superpixel-based Semantic Segmentation Trained by Statistical Process Control

    Full text link
    Semantic segmentation, like other fields of computer vision, has seen a remarkable performance advance by the use of deep convolution neural networks. However, considering that neighboring pixels are heavily dependent on each other, both learning and testing of these methods have a lot of redundant operations. To resolve this problem, the proposed network is trained and tested with only 0.37% of total pixels by superpixel-based sampling and largely reduced the complexity of upsampling calculation. The hypercolumn feature maps are constructed by pyramid module in combination with the convolution layers of the base network. Since the proposed method uses a very small number of sampled pixels, the end-to-end learning of the entire network is difficult with a common learning rate for all the layers. In order to resolve this problem, the learning rate after sampling is controlled by statistical process control (SPC) of gradients in each layer. The proposed method performs better than or equal to the conventional methods that use much more samples on Pascal Context, SUN-RGBD dataset.Comment: Accepted in British Machine Vision Conference (BMVC), 201

    Efficient and effective human action recognition in video through motion boundary description with a compact set of trajectories

    Get PDF
    Human action recognition (HAR) is at the core of human-computer interaction and video scene understanding. However, achieving effective HAR in an unconstrained environment is still a challenging task. To that end, trajectory-based video representations are currently widely used. Despite the promising levels of effectiveness achieved by these approaches, problems regarding computational complexity and the presence of redundant trajectories still need to be addressed in a satisfactory way. In this paper, we propose a method for trajectory rejection, reducing the number of redundant trajectories without degrading the effectiveness of HAR. Furthermore, to realize efficient optical flow estimation prior to trajectory extraction, we integrate a method for dynamic frame skipping. Experiments with four publicly available human action datasets show that the proposed approach outperforms state-of-the-art HAR approaches in terms of effectiveness, while simultaneously mitigating the computational complexity

    Tell Me What They're Holding: Weakly-supervised Object Detection with Transferable Knowledge from Human-object Interaction

    Full text link
    In this work, we introduce a novel weakly supervised object detection (WSOD) paradigm to detect objects belonging to rare classes that have not many examples using transferable knowledge from human-object interactions (HOI). While WSOD shows lower performance than full supervision, we mainly focus on HOI as the main context which can strongly supervise complex semantics in images. Therefore, we propose a novel module called RRPN (relational region proposal network) which outputs an object-localizing attention map only with human poses and action verbs. In the source domain, we fully train an object detector and the RRPN with full supervision of HOI. With transferred knowledge about localization map from the trained RRPN, a new object detector can learn unseen objects with weak verbal supervision of HOI without bounding box annotations in the target domain. Because the RRPN is designed as an add-on type, we can apply it not only to the object detection but also to other domains such as semantic segmentation. The experimental results on HICO-DET dataset show the possibility that the proposed method can be a cheap alternative for the current supervised object detection paradigm. Moreover, qualitative results demonstrate that our model can properly localize unseen objects on HICO-DET and V-COCO datasets.Comment: AAAI 2020 Oral Camera Read

    MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation

    Full text link
    We propose MAMo, a novel memory and attention frame-work for monocular video depth estimation. MAMo can augment and improve any single-image depth estimation networks into video depth estimation models, enabling them to take advantage of the temporal information to predict more accurate depth. In MAMo, we augment model with memory which aids the depth prediction as the model streams through the video. Specifically, the memory stores learned visual and displacement tokens of the previous time instances. This allows the depth network to cross-reference relevant features from the past when predicting depth on the current frame. We introduce a novel scheme to continuously update the memory, optimizing it to keep tokens that correspond with both the past and the present visual information. We adopt attention-based approach to process memory features where we first learn the spatio-temporal relation among the resultant visual and displacement memory tokens using self-attention module. Further, the output features of self-attention are aggregated with the current visual features through cross-attention. The cross-attended features are finally given to a decoder to predict depth on the current frame. Through extensive experiments on several benchmarks, including KITTI, NYU-Depth V2, and DDAD, we show that MAMo consistently improves monocular depth estimation networks and sets new state-of-the-art (SOTA) accuracy. Notably, our MAMo video depth estimation provides higher accuracy with lower latency, when omparing to SOTA cost-volume-based video depth models.Comment: Accepted at ICCV 202

    DIFT: Dynamic Iterative Field Transforms for Memory Efficient Optical Flow

    Full text link
    Recent advancements in neural network-based optical flow estimation often come with prohibitively high computational and memory requirements, presenting challenges in their model adaptation for mobile and low-power use cases. In this paper, we introduce a lightweight low-latency and memory-efficient model, Dynamic Iterative Field Transforms (DIFT), for optical flow estimation feasible for edge applications such as mobile, XR, micro UAVs, robotics and cameras. DIFT follows an iterative refinement framework leveraging variable resolution of cost volumes for correspondence estimation. We propose a memory efficient solution for cost volume processing to reduce peak memory. Also, we present a novel dynamic coarse-to-fine cost volume processing during various stages of refinement to avoid multiple levels of cost volumes. We demonstrate first real-time cost-volume based optical flow DL architecture on Snapdragon 8 Gen 1 HTP efficient mobile AI accelerator with 32 inf/sec and 5.89 EPE (endpoint error) on KITTI with manageable accuracy-performance tradeoffs.Comment: CVPR MAI 2023 Accepted Pape

    Prevalence of Human Papilloma Virus Infections and Cervical Cytological Abnormalities among Korean Women with Systemic Lupus Erythematosus

    Get PDF
    We performed a multicenter cross-sectional study of 134 sexually active systemic lupus erythematosus (SLE) patients to investigate the prevalence of and risk factors for high risk human papilloma virus (HPV) infection and cervical cytological abnormalities among Korean women with SLE. In this multicenter cross-sectional study, HPV testing and routine cervical cytologic examination was performed. HPV was typed using a hybrid method or the polymerase chain reaction. Data on 4,595 healthy women were used for comparison. SLE patients had greater prevalence of high-risk HPV infection (24.6% vs. 7.9%, P<0.001, odds ratio 3.8, 95% confidence interval 2.5-5.7) and of abnormal cervical cytology (16.4 vs. 2.8%, P<0.001, OR 4.4, 95% CI 2.5-7.8) compared with controls. SLE itself was identified as independent risk factors for high risk HPV infection among Korean women (OR 3.8, 95% CI 2.5-5.7) along with ≥2 sexual partners (OR 8.5, 95% CI 1.2-61.6), and Pap smear abnormalities (OR 97.3, 95% CI 6.5-1,456.7). High-risk HPV infection and cervical cytological abnormalities were more common among Korean women with SLE than controls. SLE itself may be a risk factor for HPV infection among Korean women, suggesting the importance of close monitoring of HPV infections and abnormal Pap smears in SLE patients
    corecore